Sensors 2014, 14, 11308-11350; doi:10.3390/s140711308



OPEN ACCESS 



sensors 

ISSN 1424-8220 

www.mdpi.com/journal/sensors 

Article 

A Compact Methodology to Understand, Evaluate, and Predict 
the Performance of Automatic Target Recognition 

Yanpeng Li 1,*, Xiang Li 1, Hongqiang Wang 1, Yiping Chen 1, Zhaowen Zhuang 1,
Yongqiang Cheng 1, Bin Deng 1, Liandong Wang 2, Yonghu Zeng 2 and Lei Gao 2

1 School of Electrical Science and Engineering, National University of Defense Technology, 
137 Yanwachi Street, Changsha 410073, China; E-Mails: lixiang01@vip.sina.com (X.L.);
oliverwhq@tom.com (H.W.); ypchenhk@gmail.com (Y.C.); zwzhuang@nudt.edu.cn (Z.Z.); 
nudtyqcheng@gmail.com (Y.C.); dengbin@nudt.edu.cn (B.D.) 

2 State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and 
Information System (CEMEE), Luoyang 471003, China; E-Mails: CEMEE@vip.163.com (L.W.); 
lotus2seeds@gmail.com (Y.Z.); ren_lgao@126.com (L.G.)

* Author to whom correspondence should be addressed; E-Mail: liyanpeng@nudt.edu.cn; 
Tel.: +86-134-676-955-11; Fax: +86-731-845-187-30. 

Received: 5 December 2013; in revised form: 23 May 2014 / Accepted: 9 June 2014 / 
Published: 25 June 2014 



Abstract: This paper offers a compact mechanism to carry out the performance evaluation
work for an automatic target recognition (ATR) system: (a) a standard description of the 
ATR system's output is suggested, a quantity to indicate the operating condition is presented 
based on the principle of feature extraction in pattern recognition, and a series of indexes 
to assess the output in different aspects are developed with the application of statistics; 
(b) performance of the ATR system is interpreted by a quality factor based on knowledge of 
engineering mathematics; (c) through a novel utility called "context-probability" estimation 
proposed based on probability, performance prediction for an ATR system is realized. The 
simulation result shows that the performance of an ATR system can be accounted for 
and forecasted by the above-mentioned measures. Compared to existing technologies, the 
novel method can offer more objective performance conclusions for an ATR system. These 
conclusions may be helpful in knowing the practical capability of the tested ATR system. At 
the same time, the generalization performance of the proposed method is good. 

1. Introduction 

1.1. The ATR Technology and Performance Analysis for It 

Automatic target recognition (ATR) is the capability for an algorithm or equipment to recognize 
targets or objects based on the data obtained from sensors [1,2]. ATR is an essential component of 
intelligence systems implemented with various types of sensors [1,3]. Therefore, it is of great importance 
to have an objective and quantitative performance evaluation measure for an ATR system [1]. 

ATR technology is widely employed as an essential technique in advanced systems in fields such as
the military [4,5], security [6,7] and modern medical science [8]. It enables a radar to capture its object
of interest [9,10], helps a seeker find a fixed target in a complicated scenario [3,11], and makes
accurate diagnosis possible with medical sensors [12,13]. Nowadays, ATR is most frequently found in
radar and optical sensor applications [4].

The primary principle in ATR is inverse theory: guided by the input signal collected
with certain types of sensors, the system makes decisions on the information related to the intended target [3,9] (see
"ATR system" in the figure in Section 2.1). For example, ATR can be viewed as an
inverse problem in the fields of electromagnetics and acoustics: targets of interest are sensed, the sensed
signatures are then transmitted to the detectors, and the main purpose of ATR is to use these signatures
to classify the original targets [1,5,14].

As for the components of an ATR system, the employed sensor can be a polarimetric infrared device,
a hyperspectral device, or an ultra-wideband radar [3]. Many kinds of classifiers have been investigated,
such as model-based classifiers, statistics-based classifiers, phenomenological modeling classifiers, context
information-based classifiers and information fusion classifiers [9]. With the rapid advances in sensor
technology, flexible field programmable gate arrays (FPGAs) and high performance computers, the art of ATR
is becoming pertinent to a much wider group of scientists and engineers than before [1,14]. As we
proceed into the future, there will be more and more research on and applications of ATR technologies and ATR
systems [2].

However, given the changing environment and the limitations of the sensors, an ATR system sometimes
runs into trouble [15]. For example, cells of the same kind and the diseased tissue being observed may
vary in shape, size and even quality [16,17]; the vehicles being investigated may change in velocity, pitch
and direction [6]; and a certain type of sensor can only collect limited information from the target [3]. The
problem is further complicated by the fact that so many systems and factors are involved in the signal
processing course of an ATR system [11].

Given the facts mentioned previously, the performance evaluation for ATR systems (PE-ATR) and
the performance prediction for ATR systems (PP-ATR) continue to be studied by many experts in the
field [18,19]. As mentioned above, ATR is frequently applied in radars and photo-sensors.
Consequently, the literature on performance evaluation for those ATR systems makes up the major part
of this area [20,21]. There are evaluation technologies for ATR in radar systems [22,23], performance
assessment work on ATR algorithms employing motion imagery [24], a performance prediction testbed
for an image-based ATR algorithm [20], etc. Most of the available
literature comes from the radar and photo-sensor related disciplines. Typical performance evaluation and
performance prediction technologies for an ATR system are as follows.

1.2. Scaling the Operating Condition for the ATR System 

ATR systems operate and are tested under certain conditions. These conditions may be regarded as
subsets of a multi-dimensional space of conditions [25]. To obtain an objective conclusion, the operating
condition should be considered in the performance evaluation course [9]. However, the scenarios of an
ATR system are many and varied [26]. It is therefore of great importance to develop approaches to scale
the operating condition (throughout this paper, unless otherwise stated, "operating condition" means all the
scenarios to which an ATR system can be applied) for an ATR system [27]. Unfortunately, the literature on scaling
the operating condition for an ATR system is limited. Available works are as follows:

• Operating condition description with concepts. Generally, there are four sets of conditions:
operating (here, "operating condition" is only the operational condition), testing, modeled, and
training [25]. The relations among them can be shown in a Venn diagram [25]. Although this approach
is not quantitative, it is helpful in discussing the performance of an ATR system.

• For image metric-based ATR and synthetic aperture radar (SAR) ATR, the operating condition
is sometimes quantified by means of image characterization [28]. As a fundamental idea, the
concept of "Extended Operating Condition (EOC)" is defined [27,29]. An EOC is an operating
condition "away" from the trained condition [27]. Experiments showed that the tested SAR ATR
performance (recognition rate) was very sensitive to the EOCs tested [27,30].

• With respect to the condition of signal processing in ATR, the amplitude affection factor (AAF) and the
signal to noise ratio affect factor (SNRF) are developed [9]. In view of the condition of feature
extraction, the extend of recognition (ER) is proposed [9]. These metrics are further applied in defining
performance evaluation indexes and building performance evaluation models [9].

1.3. Performance Evaluation for ATR Systems 

As mentioned above, the performance evaluation for ATR systems is studied by many experts. The
available technologies can be divided into two main groups according to the framework: model-based
and data-based approaches [9]. When estimating the performance of an ATR system, model-based
approaches usually work with a performance model such as the expected measures of effectiveness
(MOE), the robust evaluation model and the independence evaluation model [9,19]. The data-based approaches
directly calculate indexes such as recognition rate and false alarm from the recognition output [21].
In practice, these two approaches are often combined in evaluating the performance of an ATR system.

1.3.1. Basic Performance Evaluation Indexes 

In PE-ATR, basic measures such as probability of detection (Pd), recognition rate (RR), and false
alarm rate (Pfa) are generally employed [9]. Estimating the performance bound was a concern
in the early years [31].

Performance concepts are also introduced to compare performance across ATR technologies [25].
Two classes of concepts are proposed [25]. One class is referred to as performance. It includes accuracy,
extensibility, robustness, and utility [25]. These performance concepts consider the relationship between
the test data, the training data, and data from modeled conditions [25]. The other class is called cost.
It includes efficiency, scalability, and synthetic trainability [25]. The latter class of concepts puts the cost
into three categories: data collection, data storage, and data processing [25].

The confusion matrix (CM) is another widely used performance evaluation approach [19,32]. It can
be easily configured and employed for a diverse set of ATR systems. The matrix is a square grid with a
single row and a single column corresponding to each category defined in the data set. Cell (i, j) of
the matrix is the number of predicted classifications of category j that correspond to the truth source of
category i [19].
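
As an illustration of this definition, the following minimal Python sketch builds a confusion matrix from paired truth and predicted labels, with rows indexing the truth category i and columns the predicted category j. The function names and the sample labels are assumptions made for illustration only; they are not taken from [19] or [32].

```python
def confusion_matrix(truth, predicted, categories):
    """Return a square matrix cm where cm[i][j] counts the samples whose
    truth category is categories[i] and predicted category is categories[j]."""
    index = {c: k for k, c in enumerate(categories)}
    cm = [[0] * len(categories) for _ in categories]
    for t, p in zip(truth, predicted):
        cm[index[t]][index[p]] += 1
    return cm


def recognition_rate(cm):
    """Fraction of all samples that lie on the diagonal (correct decisions)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total if total else 0.0


# Hypothetical labels, for illustration only.
truth = ["tank", "tank", "truck", "truck", "jeep"]
predicted = ["tank", "truck", "truck", "truck", "jeep"]
cm = confusion_matrix(truth, predicted, ["tank", "truck", "jeep"])
```

Off-diagonal cells show which categories are confused with one another, which a single recognition rate cannot reveal.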

It should be noted that, in some related works, the quantities to show the performance of an ATR 
system are all referred to as "performance metric" or "character of performance". However, they are 
referred to as "performance indexes" in this work hereafter. 

1.3.2. Performance Evaluation Based on Performance Modeling 

In PE-ATR, many scientists work with performance models and/or evaluation models [33]. The 
existing performance models and evaluation models can be classified as: (a) models based on probability, 
statistics, and random processes [9,34]; (b) models based on Bayesian approach [35]; (c) models based 
on information theory approach [35]; (d) subsystem performance models [36]; (e) other performance 
models [37]. 

(a) Models Based on Probability, Statistics, and Random Processes 

A series of performance indexes is built based on probability, statistics, and random processes:
measurement of recognition rate (MRR), measurement of false recognition rate (MFRR), mean of MRR, 
variance of MRR, the independence of MRR to operating condition, etc. [9]. 

(b) Models Based on Bayesian Approach 

In the Bayesian approach, probability distributions are used to represent the variability in target
and background signatures [35]. To apply the method, assumptions are usually made (such as the use
of Gaussian distributions and independence of information sources) to ensure mathematical tractability.
However, these assumptions do not always hold in practice [35].
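
To show how the two tractability assumptions enter a calculation, the sketch below computes posterior class probabilities when every feature is taken to be Gaussian and the features are taken to be independent given the class. It is a generic textbook construction, not the model of [35]; the function names, class labels and parameter values are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def posterior(features, models, priors):
    """Posterior class probabilities under two simplifying assumptions:
    each feature is Gaussian, and features are independent given the class.
    models[c] is a list of (mean, std) pairs, one per feature."""
    log_joint = {
        c: math.log(priors[c]) + sum(
            gaussian_log_pdf(x, m, s) for x, (m, s) in zip(features, params))
        for c, params in models.items()
    }
    peak = max(log_joint.values())  # subtract the maximum for numerical stability
    weights = {c: math.exp(v - peak) for c, v in log_joint.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# Hypothetical one-feature signature models for a target and the background.
models = {"target": [(1.0, 1.0)], "background": [(-1.0, 1.0)]}
priors = {"target": 0.5, "background": 0.5}
```

If the real signatures are not Gaussian or the information sources are correlated, the posterior returned here can be badly miscalibrated, which is the practical weakness noted above.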

(c) Models Based on Information Theory Approach 

This kind of model casts the recognition problem as a communication process [35]. Information
theory brings in the notion of entropy and measures of relative information to determine how
information, and thus performance, is lost along the processing course [35]. It may suffer
when its assumptions do not match reality closely enough. Performance indexes of this kind
have been applied in evaluating SAR ATR [38].
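
To make the communication-process view concrete, the sketch below computes the Shannon entropy of a class distribution and the mutual information between the true class and the system's decision from a matrix of joint counts. These are the standard textbook definitions, given here as an assumption about what such an index might look like; they are not the specific indexes of [35] or [38].

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(counts):
    """Mutual information, in bits, between the row variable (truth) and the
    column variable (decision), given a matrix of joint counts."""
    total = sum(sum(row) for row in counts)
    row_p = [sum(row) / total for row in counts]
    col_p = [sum(row[j] for row in counts) / total for j in range(len(counts[0]))]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n > 0:
                p = n / total
                mi += p * math.log2(p / (row_p[i] * col_p[j]))
    return mi
```

With two equally likely classes, a perfect classifier preserves the full 1 bit of class information, while a classifier whose decisions are independent of the truth preserves none.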






(d) Subsystem Performance Models or Performance Model for Certain Metric 

The computational burden is an important metric for image recognition [16]. It is further considered 
for image recognition of high resolution radar sensors [16]. 

For SAR ATR, polarization and resolution may affect the performance [36,39]. This can be studied
with the help of performance curves (probability of detection versus false alarms) [36]. For some ATR
algorithms, performance curves at all three ATR stages (detection, discrimination, and classification)
for certain combinations of polarization and resolution were studied by the Lincoln Laboratory [36,39].

Performance evaluation of the subsystem of an ATR system is meaningful. The reliability analysis of 
the sensor employed in ATR is of interest [40]. Performance indexes are built on two fundamental 
issues: reasonable dissimilarity among evidences, and adaptive combination of static and dynamic 
discounting [40]. These measures are helpful to optimize the mentioned ATR algorithm [40]. 

(e) Other Performance Models 

To study the potential of polarimetric SAR interferometry (POLInSAR) for developing
new classification methods for ships, performance evaluation has been performed to accomplish a
trade-off between geometry description accuracy and method robustness in reference feature vectors
(or patterns) [37]. Experiments showed that a low number of vectors could lead to an overestimation of
the classification rate, and that an excessive number of patterns would cause quite similar geometries to be
classified in different classes [37].

1.3.3. Receiver Operating Characteristic Analysis and Similar Approaches 

Receiver operating characteristic (ROC) analysis is a broadly used performance analysis tool in
signal processing and communications [34,41]. Researchers have introduced this notion into PE-ATR.
A three-dimensional (3-D) ROC trajectory was presented to compare competing target recognition
algorithms when unknown targets are present in the data [34]. It is a useful tool for SAR image analysis
because it helps in understanding the tradeoffs between the probability of rejection and the other two
performance measures commonly used in detection problems [34,42].
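
For readers unfamiliar with the conventional two-dimensional ROC curve on which these extensions build, the sketch below sweeps a decision threshold over detector scores and records the resulting (false-alarm rate, detection rate) pairs. It does not implement the 3-D trajectory of [34]; the function names and the assumption that higher scores indicate a target are illustrative.

```python
def roc_points(scores, labels):
    """Return (false-alarm rate, detection rate) pairs obtained by sweeping
    the decision threshold over every distinct score, highest first.
    labels: 1 for target present, 0 for target absent. Both kinds of sample
    must occur at least once."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        detections = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        false_alarms = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((false_alarms / negatives, detections / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

The area under the curve condenses the whole tradeoff into one number, at the cost of hiding where along the curve a given system actually operates.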

Scientists also extended the conventional ROC analysis from single-signal detection to detection and 
classification of multiple signals [41,43]. Applications showed it was a flexible utility in PE-ATR [41]. 

An extension of the ROC method is the analysis of performance bounds in different scenarios [15].
Some analytical results on PE-ATR are obtained under complicated, non-Gaussian models and
optimized system parameters [15]. For targets composed of a constellation of geometrically simple
reflectors, lower and upper bounds on the probability of correct classification are estimated in SAR
ATR [44,45]. In performance evaluation for sidescan sonar target classification, some common bounds
are derived to show the properties of ATR [46]. In pose estimation related to ATR, Hilbert-Schmidt
lower bounds for estimators on matrix Lie groups are defined and validated [47].

Another extension of the ROC method is the use of confidence intervals for ATR performance evaluation
indexes [48,49]. The provided confidence interval estimator includes proportion estimation based on the
binomial distribution and rate estimation based on the Poisson distribution. Under the Bayesian posterior
distribution, this estimator is substantially more accurate than other similar approaches [48].
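
As a simple illustration of attaching an interval to a measured recognition rate, the sketch below computes the Wilson score interval for a binomial proportion. This is a standard frequentist interval chosen for brevity; it is not the Bayesian estimator of [48], and the function name and default z value are assumptions.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.
    z = 1.96 corresponds to roughly 95% confidence."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z * z / trials
    centre = (p_hat + z * z / (2 * trials)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials))
    return centre - half_width, centre + half_width
```

For example, 90 correct recognitions in 100 trials and 900 in 1000 trials give the same point estimate, but the second interval is much narrower, which is why the sample size matters when comparing ATR systems.
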

Automatic fingerprint recognition is an interdisciplinary field. It includes image processing, pattern
recognition, computer technology, and so on. The confidence interval is compared between different
automatic fingerprint recognition algorithms [50]. A performance model is built based on statistics.
It can be applied to estimate the uniqueness of the template in classifiers [50].

1.3.4. Performance Evaluation Framework 

Performance evaluation indexes assess the capability in various aspects. However, people sometimes
seek an integrated conclusion across these aspects [21]. Therefore, performance evaluation frameworks
have been investigated. A generalized performance model is built based on fuzzy comprehensive
evaluation, fuzzy integration and fuzzy cluster analysis [9]. These performance models can offer an
algorithm-independent view of the ATR performance [21].

1.3.5. Other Performance Evaluation Methods 

Underwater target recognition is challenging due to the presence of noise and point-spread function
effects resulting from camera or media inhomogeneities [51]. An image compression transform is
sometimes applied. A performance evaluation method for data compression transforms has been developed
to achieve low-distortion images that ease the burden on classifiers [51].

For automatic face recognition systems, the effect of racial and gender demographics on estimating
the accuracy of algorithms is considered [52]. It was reported that differences in the match threshold were
required to obtain a false alarm rate of 0.001 when demographic controls on the non-matched identity
(race or gender) pairs varied [52].

1.3.6. Performance Evaluation System or Performance Evaluation Testbed 

As for PE-ATR software or a testbed, an example is given where Python (an open source scripting 
language) and OpenEV (a viewing and analysis tool) have been incorporated [53]. This testbed gives 
important insight into the risks as well as the successful use of open source language in ATR [53]. 

An experimental system called automated instrumentation and evaluation (Auto-I) has been developed [32].
Auto-I has a module for automatic adaptation of algorithm parameters using algorithm performance
models, optimization and artificial intelligence techniques [32]. The presented design of Auto-I is
modular, so it can be interfaced to ATR systems other than the one in [32].

For image-based target detection, a complete truthing system has been developed [54]. It is named
the "Scoring, Truthing, And Registration Toolkit (START)" [54]. This toolkit can align the images
of the identical scene to a common reference frame. Then, "truthing" is applied to specify target identity, 
position, orientation, and other scene characteristics [54]. Finally, "scoring" is used to evaluate the 
performance of certain algorithms as compared to the specified truth [54]. 

1.4. Performance Prediction for ATR Systems

Compared to performance evaluation work, the existing performance prediction methods are fairly
limited [20]. To some extent, the available performance prediction methods are extensions
of PE-ATR.

1.4.1. Basic Performance Prediction Approaches for ATR Systems 

Based on image measures quantifying the intrinsic difficulty of ATR, a performance forecaster has been
developed [20]. The performance measures include constant false alarm rate (CFAR), power spectrum
signature, probability of edge, etc. This algorithm offers a method for predicting ATR performance based
on information extracted directly from the imagery [20]. Statistical accuracy is another basic index
in performance prediction [55].

A generally employed performance prediction index is the performance bound, namely, the upper bound [56]
and the lower bound [21]. In this approach, the frequently considered performance measures include detection rate,
false alarm rate and recognition rate.

1.4.2. Performance Prediction Based on Performance Modeling 

When predicting the performance of an ATR system, performance models are widely employed [20].
Simple models are easy to configure, but they cannot accurately quantify performance [57].
Detailed models may respond freely to the scenario; however, they are difficult to
investigate [35,57].

When the features are distorted by uncertainty (occlusion and/or clutter) in both feature locations 
and magnitudes, the performance of an ATR system is especially difficult to predict. A practical way 
is to estimate the performance bound for the system [57]. For a vote-based object recognition system,
lower and upper bounds on the recognition ability have been forecast [57]. This approach takes object
model similarity into account, so that when the models of objects are more similar to each other, the
expected recognition rate is lower [57].

The parameters of ATR algorithms can be used for predicting the performance for an ATR 
system [58]. The levels of robustness and invariance of parameters are employed as predictive indicators 
of ATR performance along with self refusal capabilities of the ATR algorithms [58]. 

A model of a subsystem of an ATR system can be introduced in forecasting the performance of
the system [59]. One such method models the capability of the classifier. The classifier is based
on a Bayes match between a vector of extracted scattering features and a vector of predicted features.
Uncertainty in both extracted and predicted features is included in the match metric (evaluation
index) [59]. With scattering centers extracted from measured SAR imagery of ten targets, experiments
show that the proposed match metric (evaluation index) is helpful in predicting the performance of an
ATR system [59].

To estimate and predict the computational error of an ATR system, scientists developed an error
probability distribution method [60,61]. It is resolved from an error function that is derived from the
parse tree which represents a given ATR algorithm [60,61]. Field tests of performance prediction were
performed in terms of computational accuracy, cost, and portability. The results show the prediction is
reasonable [60,61].

Algorithm-independent prediction of ATR performance is highly desirable. To facilitate
evaluation of performance tradeoffs for SAR designs, performance predictions are performed including
both parameter selections (e.g., bandwidth and transmit power) and added domains of SAR observation,
such as 3-D, full polarimetry, aspect diversity, and/or frequency diversity [62]. The discussion of the
performance of 3-D SAR includes parameter tradeoffs of various height resolutions at the target, and
various numbers of sensors [62]. This work is significant in supporting SAR ATR design.

1.4.3. Other Performance Prediction Methods

To optimize the speech recognition performance in a computer assisted language learning system,
a decision tree-based method is incorporated to predict possible speaking errors made by non-native
speakers [63]. Trials of the language learning system and the performance prediction were
conducted [63]. Positive feedback was reported [63].

The confidence interval is compared between different automatic fingerprint verification 
algorithms [50]. A performance model is built based on statistics. It can be applied to estimate the 
uniqueness of the template in classifiers [50]. 

1.4.4. Performance Prediction System or Performance Prediction Testbed

The aforementioned image measures (CFAR, power spectrum signature and probability of edge)
are applied in software implemented to validate the performance of some infrared (IR)
image-based ATR algorithms [20]. For an imagery automatic target detection (ATD) system, these
metrics are also employed in a software tool developed at Los Alamos National Laboratory [64].
Prototype software has also been developed to reveal the computational error of an ATR system [60,61].

1.5. Limitations of the Available Approaches on Performance Evaluation and Performance Prediction 
for an ATR System 

Based on the materials presented above, the time-line of the evolution in PE-ATR is summarized
in Figure 1. In the performance evaluation and the performance prediction work for an ATR system,
the aforementioned methods offer us a range of choices. However, there are still notable weaknesses in
this area:

First of all, most of the performance evaluation and performance
prediction approaches do not take the operating condition into account in their calculations. As a result, the performance
evaluation and the performance prediction output may lack objectivity [19].

Secondly, the available performance evaluation methods cannot work flexibly, and no general
reference frame has yet been built [22,41]. Furthermore, some of the performance evaluation indexes are
too simple to reveal the problem-solving capability of an ATR system [65].

In addition, there are at present few mature performance prediction tools that can be used in field
tests [66].

Therefore, in PE-ATR and PP-ATR, sound methodologies that adapt flexibly to the scenario while
remaining objective are key topics [3].

Figure 1. The time-line of the evolution in PE-ATR & PP-ATR. 

1.6. Design Objective of This Work and Its Layout

The contributions of this paper include: (a) a measure to scale the operating condition for ATR;
(b) the definition of performance evaluation indexes; (c) the construction of performance evaluation
and performance prediction functions. As a result, a novel approach is developed for the performance
evaluation work in ATR. Compared to the existing methods, this approach is compact, scenario-adaptive
and easy to configure. Because this approach takes the operating condition into account in the
evaluation or prediction course, an objective conclusion may be arrived at.

In organizing this paper, the problem and its background are analyzed first, and the key ideas of this
work are explained. These are the main contents of Section 1. The rest of the paper is organized as follows:

• The majority of our work concerns the performance evaluation and performance prediction work
in ATR. This is further detailed in Section 2.

The general idea of this methodology is summarized in Section 2.1.

In Section 2.2, some similar techniques related to ATR are identified and the ATR system's output is
classified. The sample size in various experiments is resolved.

To offer an objective evaluation conclusion, the ATR system's condition is scaled in Section 2.3. The
proposed index is inspired by the measures of similarity in pattern recognition.

In Sections 2.4 and 2.5, the performance evaluation work is implemented with performance
evaluation indexes and an evaluation function. The proposed performance evaluation indexes are
built based on probability and mathematical statistics. The most important principles are the
tests of statistical hypotheses: the hypothesis test of distribution specialty and the hypothesis test
of independence.

In Section 2.6, performance prediction is realized with a generalized function. The proposed
performance prediction approach is built on the idea of expert prediction (EP, a branch of
machine learning).

• To confirm the practicability of this work, experiments are implemented in Section 3. The ATR 
algorithms setup and the data are explained in Section 3.1. Simulation results and the analysis of 
them are shown in Sections 3.2-3.4. Comprehensive topics related to this work are discussed in 
Section 3.5. 

• In Section 4, a summary is provided and the future topics are suggested. 

In view of the proposed indexes, this work spans a number of scientific disciplines, and there are many
references concerning those topics. Although the related scientific background is not presented in full in
the text, it is identified for each proposed index.

2. The Algorithm to Understand, Evaluate, and Predict the Performance of an ATR System 

2.1. The Idea to Evaluate an ATR System's Performance 

Because the ATR system is flexible and many constituent components interact in a complicated way,
it is impossible to model an ATR system's output as a function of all the influencing factors. A more viable
approach (the idea in this work) is to observe the input and the corresponding output, and to determine
the comprehensive performance in handling a certain target [26]. In carrying out the theoretical part of
this work, we follow the steps listed below.

• First, the definition of ATR is investigated. The ATR system's output is classified. These are the
foundation of the entire work. 

• Secondly, an index is proposed to scale the operating condition for recognition. This index 
can be further utilized in developing the performance evaluation index and performance 
evaluation function. 

• Thirdly, a series of evaluation indexes is developed. The precision, the robustness and the
independence of the recognition output are measured. 

• The fourth step is building a performance evaluation function. The proposed evaluation 
indexes and the operating condition are integrated. A general conclusion may be arrived at with 
this function. 

• The final step is developing an algorithm to predict the ATR system's performance. 

Figure 2. The practical way to evaluate an ATR system's performance.

2.2. The Definition of Automatic Target Recognition, the Identification of Some Similar Techniques, and the
Classification of an ATR System's Output

As most researchers will admit, the main component of ATR is a signal processing course in which
the system is trained in advance with information regarding the target concerned. The system can then be used to
make decisions on the input signal about the potential target. Usually, its output is used for further
decision making or action. Typically, there are three terms relating to this system: "classification",
"recognition" and "identification". Some scientists have discussed this point [1,3]. Here, ATR is distinguished
from automatic target discrimination (ATD) and automatic target identification (ATI). If the input
signal contains information from a trained target, the processing course is called ATR. If there
is only information from an untrained target in the collected signal, the processing course is called
ATD, which, in nature, discriminates the signal as "having no information related to any trained target".
Moreover, if there is no information from any target in the obtained signal, the processing course is
called ATI. This essentially identifies the signal as "having no information related to any target at all".

The difference among these three technologies related to ATR is shown in Figure 3.

Figure 3. The difference among ATR, ATD and ATI. 
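
The three definitions above amount to a small decision rule on what the input signal contains. A minimal sketch follows, in which the enumeration `Course`, the function `processing_course` and its two boolean arguments are hypothetical names introduced only for illustration.

```python
from enum import Enum


class Course(Enum):
    ATR = "automatic target recognition"
    ATD = "automatic target discrimination"
    ATI = "automatic target identification"


def processing_course(has_trained_target, has_untrained_target):
    """Name the processing course from what the input signal contains."""
    if has_trained_target:
        return Course.ATR  # information from a trained target is present
    if has_untrained_target:
        return Course.ATD  # only an untrained target is present
    return Course.ATI      # no target at all in the signal
```

The order of the checks mirrors the text: the presence of a trained target takes precedence, and ATI is reached only when the signal carries no target information whatsoever.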

With these preparations, the output of an ATR system can be classified as in Table 1, where
"[R]", "[D]" and "[I]" denote "recognition", "discrimination" and
"identification", respectively. It can be seen that there are three types of signal fed to the sensor: "target
A", "an untrained target", and "no target". So, i = 1, 2, 3. There are four types of output of the ATR
system. So, j = 1, 2, 3, 4. Each false decision in these activities can be classified into a false type, as
is shown in Table 1.



Table 1. For an automatic target recognition system, all the decision types and their names. 

Input signal (i) | j = 1 | j = 2 | j = 3 | j = 4
Untrained target | … | False [D] type I (n_22) | True [D] (n_23) | Omitted [D] (n_24)
No target | False [I] type I (n_31) | False [I] type I (n_32) | False [I] type II (n_33) | True [I] (n_34)

The designation of Table 1 is as follows. When the feed signal contains information from target A, "False [R]" is the decision type in which the ATR system's output is a trained target other than target A, "Omitted [R]" is the decision type in which the ATR system's output is "cannot figure out the target type," and "True [R]" is the decision type in which the ATR system's output is target A.

2.3. Scaling the Condition for Recognition 

In order to judge ATR systems in an objective way, one must scale the condition for recognition. This is measured by a newly developed quantity called "Innovation for recognition (INR)", which, by calculating the distance among the samples inside a certain target type and the distance among different target types, indicates the degree of difficulty in recognizing a certain trained target.

Firstly, for the testing samples (testing data) and the training samples (training data), the distance between their feature column vectors is considered.

Suppose there are $t_1$ different types of training targets in the system, and the targets are distinguished by features in $m$ dimensions. $x^{(i_1, i_2)}$ is the feature column vector of a testing sample, where $i_1$ is the serial number of the target and $i_2$ is the serial number of the sample. Here, $i_1 = 1, 2, \ldots, t_1$ and $i_2 = 1, 2, \ldots, S_{\mathrm{te}}(i_1)$, where $S_{\mathrm{te}}(i_1)$ is the total number of the testing samples of target $i_1$. Likewise, $x^{(i_3, i_4)}$ is the feature column vector of a training sample, where $i_3$ is the serial number of the target and $i_4$ is the serial number of the sample. Here, $i_3 = 1, 2, \ldots, t_1$ and $i_4 = 1, 2, \ldots, S_{\mathrm{tr}}(i_3)$, where $S_{\mathrm{tr}}(i_3)$ is the total number of the training samples of target $i_3$. As a result, for target $i_1$ and target $i_3$,

$$\delta\left(x^{(i_1, i_2)}, x^{(i_3, i_4)}\right) = \left[\left(x^{(i_1, i_2)} - x^{(i_3, i_4)}\right)^{T} \left(x^{(i_1, i_2)} - x^{(i_3, i_4)}\right)\right]^{1/2} \quad (1)$$

is the distance of the feature column vector from these two sets of samples.
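Then, the INR of target $i_1$ can be solved by:

$$d(i_1) = \frac{\dfrac{1}{S_{\mathrm{te}}(i_1)} \sum_{i_2=1}^{S_{\mathrm{te}}(i_1)} \left( \dfrac{1}{t_1 - 1} \sum_{i_3=1,\, i_3 \neq i_1}^{t_1} \dfrac{1}{S_{\mathrm{tr}}(i_3)} \sum_{i_4=1}^{S_{\mathrm{tr}}(i_3)} \delta_1(\cdot) \right)}{\dfrac{1}{S_{\mathrm{te}}(i_1)\, S_{\mathrm{tr}}(i_1)} \sum_{i_2=1}^{S_{\mathrm{te}}(i_1)} \left( \sum_{i_4=1}^{S_{\mathrm{tr}}(i_1)} \delta_2(\cdot) \right)} \quad (2)$$

where $\delta_1(\cdot) = \delta\left(x^{(i_1, i_2)}, x^{(i_3, i_4)}\right)$, $\delta_2(\cdot) = \delta\left(x^{(i_1, i_2)}, x^{(i_1, i_4)}\right)$, $S_{\mathrm{te}}(i_1)$ is the number of testing samples of target $i_1$, and $S_{\mathrm{tr}}(i_3)$ is the number of training samples of target $i_3$. In Equation (2),

$$\frac{1}{S_{\mathrm{te}}(i_1)} \sum_{i_2=1}^{S_{\mathrm{te}}(i_1)} \left( \frac{1}{t_1 - 1} \sum_{i_3=1,\, i_3 \neq i_1}^{t_1} \frac{1}{S_{\mathrm{tr}}(i_3)} \sum_{i_4=1}^{S_{\mathrm{tr}}(i_3)} \delta_1(\cdot) \right) \quad (3)$$

indicates the distance between the feature column vectors from (a) the testing samples of the training target $i_1$, and (b) the training samples of all the training targets except for target $i_1$. Here,

$$\frac{1}{S_{\mathrm{te}}(i_1)\, S_{\mathrm{tr}}(i_1)} \sum_{i_2=1}^{S_{\mathrm{te}}(i_1)} \left( \sum_{i_4=1}^{S_{\mathrm{tr}}(i_1)} \delta_2(\cdot) \right) \quad (4)$$

shows the distance between the feature column vectors from (a) the testing samples of the training target $i_1$, and (b) the training samples of the target $i_1$. The factor $1/(t_1 - 1)$ brings in an average among the $t_1 - 1$ other targets.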
The normalized form of INR is used generally,

$$\bar{d}(i_1) = d(i_1) / d_0 \quad (5)$$

where $d_0 = \max d(i_1)$ is the maximum value over all possible operating conditions. For a certain ATR system handling a certain target, the lower the INR, the more difficult it is to perform the recognition task.
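The idea behind INR can be illustrated with a minimal Python sketch. It assumes that INR is the ratio of the mean between-target distance to the mean within-target distance, with the between-target term averaged over the other targets; this normalization, the function names, and the toy feature vectors are all assumptions made for illustration and are not taken from the experiments.

```python
import numpy as np

def euclidean(a, b):
    """Distance between two feature column vectors, as in Equation (1)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt(diff @ diff))

def mean_pair_distance(test_samples, train_samples):
    """Mean distance over all (testing sample, training sample) pairs."""
    return float(np.mean([euclidean(x, y)
                          for x in test_samples for y in train_samples]))

def inr(target, test_sets, train_sets):
    """Between-target over within-target mean distance for one target.

    Assumption: the between-target term is a plain average over the
    other targets; the exact normalization in the paper may differ.
    """
    within = mean_pair_distance(test_sets[target], train_sets[target])
    others = [t for t in train_sets if t != target]
    between = np.mean([mean_pair_distance(test_sets[target], train_sets[t])
                       for t in others])
    return float(between / within)

# Hypothetical two-dimensional features for three trained targets.
train_sets = {
    1: [[0.0, 0.0], [0.0, 2.0]],
    2: [[4.0, 0.0], [4.0, 2.0]],
    3: [[0.0, 6.0], [2.0, 6.0]],
}
test_sets = {1: [[0.0, 1.0]], 2: [[4.0, 1.0]], 3: [[1.0, 6.0]]}

raw = {t: inr(t, test_sets, train_sets) for t in train_sets}
d0 = max(raw.values())                       # normalizing constant
normalized = {t: v / d0 for t, v in raw.items()}
print(raw, normalized)
```

A target whose testing samples lie close to its own training samples and far from the other targets obtains a large value, which matches the statement that a low INR marks a difficult recognition condition.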

In building the INR index, the related principle is the knowledge of feature extraction in pattern recognition, as detailed in the literature [67].



2.4. Performance Evaluation Indexes 

For a practical ATR system, an accurate and robust output is highly desirable. It is important that the result should be independent of the operating condition, or at least should be influenced by it as little as possible. The following capacities are of concern:

(a) The general approach of the recognition output (GARO). GARO weighs the recognition output on the basis of whether or not it reaches the desired level of correct decisions. Suppose the sample size in Table 1 fulfills the requirements of hypothesis testing. There are two schemes for GARO: naked GARO (n-GARO) and GARO with cost (c-GARO), denoted by $I_1$ and $I_2$ respectively,

$$I_1 = \frac{n_{11}}{\sum_{j=2}^{4} n_{1j}} \cdot \frac{n_{23}}{\sum_{j=1,\, j \neq 3}^{4} n_{2j}} \cdot \frac{n_{34}}{\sum_{j=1}^{3} n_{3j}} \quad (6)$$

$$I_2 = \frac{\omega_{11} n_{11}}{\sum_{j=2}^{4} \omega_{1j} n_{1j}} \cdot \frac{\omega_{23} n_{23}}{\sum_{j=1,\, j \neq 3}^{4} \omega_{2j} n_{2j}} \cdot \frac{\omega_{34} n_{34}}{\sum_{j=1}^{3} \omega_{3j} n_{3j}} \quad (7)$$

Here, $\omega_{ij} \geq 1$, $i = 1, 2, 3$, $j = 1, 2, 3, 4$ are the assigned values of cost; usually, $\omega_{11} = \omega_{23} = \omega_{34} = 1$. The cost in c-GARO is introduced to distinguish the risk of different types of decisions. These costs are empirically set according to the scenario.

If any fraction of $I_1$ and/or $I_2$ falls into the 0/0 form, it is then set to 1.
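As a rough illustration, n-GARO and c-GARO could be computed from the decision counts of Table 1 as sketched below. The sketch assumes that the three true-to-false ratios are combined as a product (an assumption consistent with setting a 0/0 fraction to 1), and the handling of a row with true decisions but no false ones is also an assumption; the counts and cost values are hypothetical.

```python
def ratio(true_count, false_count):
    """True-to-false ratio; the 0/0 form is set to 1 as stated in the text."""
    if true_count == 0 and false_count == 0:
        return 1.0
    if false_count == 0:
        # Assumption: the text does not cover this case; treat it as unbounded.
        return float("inf")
    return true_count / false_count

def garo(counts, cost=None):
    """counts[i][j] holds n_ij (3 inputs x 4 outputs); cost=None gives n-GARO."""
    if cost is None:
        cost = [[1.0] * 4 for _ in range(3)]
    true_column = [0, 2, 3]  # n_11, n_23, n_34 in zero-based indexing
    value = 1.0
    for i, tj in enumerate(true_column):
        weighted = [cost[i][j] * counts[i][j] for j in range(4)]
        value *= ratio(weighted[tj], sum(weighted) - weighted[tj])
    return value

# Hypothetical counts: target A, an untrained target, and no "no target" trials.
counts = [
    [40, 5, 3, 2],
    [2, 3, 45, 0],
    [0, 0, 0, 0],
]
cost = [
    [1, 2, 2, 1],
    [2, 2, 1, 1],
    [1, 1, 1, 1],
]
n_garo = garo(counts)        # (40/10) * (45/5) * 1
c_garo = garo(counts, cost)  # (40/18) * (45/10) * 1
print(n_garo, c_garo)
```

Raising the cost of a false decision type lowers the index for the same counts, which is how c-GARO separates decisions of different risk.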

The n-GARO is introduced with the knowledge of "summary measures" in statistics [68,69], while 
the c-GARO is found with the knowledge of "summary measures" in statistics and "numeric analysis" 
in engineering mathematics [68,70]. 

(b) The robustness of the recognition output (RRO). RRO checks whether the operating output 
samples have the same distribution as that in the training course. RRO is revealed by the 
distribution specialty of GARO, through a rank-sum test. The related knowledge is "hypothesis 
testing" in statistics [68,69]. 

Suppose that there are $n_1$ samples of n-GARO in the training course and $n_2$ samples of n-GARO in the testing course, and that all these samples are obtained under the same INR level, that is to say, within the same INR confidence interval. The Wilcoxon rank-sum test is applied [68]. Let $R_1$ stand for the rank summation of the training samples; then,

$$I_3 = 1 + \ln \frac{2R_1 - n_1(n_1 + 1)}{n_1(n_1 + 2n_2 + 1) - 2R_1} \quad (8)$$

is the normalized RRO. It shows whether the two concerned sample sets follow the same distribution. The idealized value of $I_3$ is 1. Proof of this point can be found in [68,69].
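The rank summation used by RRO can be sketched in plain Python as follows. The rank sum itself is the standard Wilcoxon quantity; the final mapping to $I_3$ assumes the form $1 + \ln[(2R_1 - n_1(n_1+1)) / (n_1(n_1 + 2n_2 + 1) - 2R_1)]$, and the sample values are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def rro(train, test):
    """Return (R_1, I_3) for training and testing samples of n-GARO."""
    n1, n2 = len(train), len(test)
    ranks = average_ranks(list(train) + list(test))
    r1 = sum(ranks[:n1])                       # rank sum of training samples
    upper = 2 * r1 - n1 * (n1 + 1)
    lower = n1 * (n1 + 2 * n2 + 1) - 2 * r1
    # Assumed form of the index; undefined when either term is zero.
    return r1, 1 + math.log(upper / lower)

# Hypothetical n-GARO samples with evenly interleaved ranks.
r1_even, i3_even = rro([5.8, 6.1, 6.2, 6.5], [5.9, 6.0, 6.3, 6.4])
# Hypothetical samples where the training values rank lower overall.
r1_low, i3_low = rro([6.0, 6.4, 5.8, 6.2], [6.1, 5.9, 6.3, 6.5])
print(r1_even, i3_even, r1_low, i3_low)
```

When the training ranks are balanced against the testing ranks, the fraction equals 1 and the index takes its idealized value of 1.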

The RRO with cost has not been touched here, as it can be arrived at in a similar manner. 

(c) The independence of the recognition output to condition (IRO). Through the hypothesis test of 
independence, IRO estimates the independence of the recognition output to condition. Here, the 
hypothesis test of independence shows the influence (or impact) of the testing condition on the 
ATR system's performance. The related knowledge is "hypothesis testing" in statistics [68,69]. 

The two sets involved in the test are INR and n-GARO. There are $s_1$ subclasses in INR and $s_2$ subclasses in n-GARO, $P(\mathrm{INR} = i, \text{n-GARO} = j) = p_{ij}$, $\forall i \in [1, s_1]$, $\forall j \in [1, s_2]$. The population (INR, n-GARO) has a sample size $m$. $m_{ij}$ is the sample size when INR is in its $i$th subclass and n-GARO is in its $j$th subclass; $m_{i\cdot}$ is the sample size when INR is in its $i$th subclass, taken over all n-GARO subclasses. $p_{i\cdot} = m_{i\cdot}/m$. $m_{\cdot j}$ and $p_{\cdot j}$ have the similar meaning for n-GARO. Let:

$$\chi^2 = \sum_{i=1}^{s_1} \sum_{j=1}^{s_2} \frac{\left(m_{ij} - m\, p_{i\cdot}\, p_{\cdot j}\right)^2}{m\, p_{i\cdot}\, p_{\cdot j}} = \sum_{i=1}^{s_1} \sum_{j=1}^{s_2} \frac{\left(m_{ij} - m_{i\cdot} m_{\cdot j}/m\right)^2}{m_{i\cdot} m_{\cdot j}/m} \quad (9)$$



stand for the test statistic, and let the threshold be $\eta$. Then, IRO is arrived at by:

$$I_4 = \frac{1}{1 + \dfrac{1}{\Delta} \left| \ln\left(\chi^2 / \eta\right) \right|} \quad (10)$$

where $\Delta$ is the variation range of $d(i_1)$. The idealized value of $I_4$ is 1.

Further materials related to the hypothesis test of independence can be found in [68,71]. 
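The test statistic of the hypothesis test of independence can be sketched as below. This is the standard chi-square statistic for a contingency table whose rows are INR subclasses and whose columns are n-GARO subclasses; the two tables are hypothetical, and the mapping from the statistic to IRO is deliberately left out.

```python
def chi_square(table):
    """Chi-square statistic of independence for a table of counts m_ij."""
    m = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]          # m_i.
    col_totals = [sum(col) for col in zip(*table)]    # m_.j
    stat = 0.0
    for i, row in enumerate(table):
        for j, m_ij in enumerate(row):
            expected = row_totals[i] * col_totals[j] / m
            stat += (m_ij - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows are INR subclasses, columns are n-GARO subclasses.
dependent = [[10, 20], [20, 10]]
independent = [[10, 20], [20, 40]]
print(chi_square(dependent), chi_square(independent))
```

A table whose rows are proportional to each other gives a statistic of zero, meaning the recognition output shows no dependence on the condition in that sample.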

2.5. The Way to Understand and Evaluate the Performance of an ATR System 

On the basis of the previous work, the performance of an ATR system can be interpreted and evaluated in two ways. One is to list the value of INR and the corresponding evaluation indexes. This can be easily realized, but the result cannot be understood well by people outside of this field. Another way is to introduce a comprehensive function of these parameters, namely, the quality factor of the ATR system (QF-ATR) in attacking target $i_1$,

$$Q(i_1) = \frac{I_1(i_1)\, I_3(i_1)\, I_4(i_1)}{d(i_1)} \quad (11)$$

$i_1 = 1, 2, \ldots, t_1$. By applying the Monte Carlo test, the final comprehensive comment may be obtained. Here, $Q(i_1)$ is the expression of QF-ATR in the calculating course. It should be noted that, in calculating QF-ATR, $d(i_1)$ is the mean value of its variation interval. When recognizing a certain target under a certain situation, the larger the QF-ATR is, the better the performance of the system. The QF-ATR with cost is resolved in a similar way.

The QF-ATR index is introduced with the knowledge of "summary measures" in statistics and "numeric analysis" in engineering mathematics [68,70].

2.6. Predicting the Performance of an ATR System

Performance prediction work can be classified into three situational categories: (a) forecasting the performance for a repeated test with a familiar system and target; (b) predicting the performance for a tested system on a newly trained target; (c) figuring out the capability for a new ATR system on a familiar or a novel target. As an example, $I_1$ is chosen as the performance index to be predicted.

First, to estimate the performance in managing a trained target $i_1$ for a repeated test with INR equaling $d_j(i_1)$, the test records of this target, with consistent INR, are taken from the database. These records are the seeds for forecasting work. A requirement imposed on the newly formed set is that its sample size should be no less than that required by the corresponding hypothesis testing. This requirement applies here as well as in the following cases.

Another mission is to estimate the performance for the first test in coping with a newly trained target $i_n$. In order to proceed with the forecasting work, the database of training output is consulted. While the operating conditions may be far more varied, it is supposed that the target's INR states fall into $d_j(i_n)$, $j = 1, 2, \ldots, J$ in the training course. For predicting the performance in a certain state $d_m(i_n)$, $m = 1, 2, \ldots, M$, the training records whose INR is $d_m(i_n) \pm o$ are taken out. These records are the seeds for performing the prediction work. Here, $o$ is a reasonably tiny quantity in practice. For example, $o = 0.05 \times d_j(i_n)$.
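Selecting the prediction seeds by INR can be sketched as a simple filter. The record layout (a dictionary with `inr` and `i1` keys), the function name, and the values are hypothetical; only the 5% window follows the example given above.

```python
def select_seeds(records, target_inr, rel_tol=0.05):
    """Keep records whose INR lies within target_inr +/- rel_tol * target_inr."""
    window = rel_tol * target_inr
    return [r for r in records if abs(r["inr"] - target_inr) <= window]

# Hypothetical training records: INR state and the observed I_1.
records = [
    {"inr": 0.30, "i1": 6.1},
    {"inr": 0.32, "i1": 6.3},
    {"inr": 0.40, "i1": 7.0},
    {"inr": 0.31, "i1": 6.0},
]
seeds = select_seeds(records, target_inr=0.31)
print([r["i1"] for r in seeds])
```

The record with INR 0.40 falls outside the window and is not used as a seed, while the original order of the remaining records is preserved for the sequence-based weighting.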

Third, and no less important, to predict the performance of a newly developed ATR system, one may consult systems that use similar approaches to process the same or similar targets. The procedures are not repeated here, as they are similar to those in the previous situations. One should be aware that, even for the same target within a uniform environment, the INR may differ between systems.

Once the preparation has been completed, a newly developed prediction methodology, referred to as "context-probability (CP)", is applied. CP is useful for estimation and forecasting work in complicated systems such as an ATR system, where many different variables interact in a complex fashion that cannot be captured in clear expressions. In addition, the system may provide increasingly accurate and robust results by incorporating historical data into the calculations. So, the new measure should take into account both sequential information and probability. The procedures are:

(a) Collecting the seeds for prediction according to their sequence; here, $I_1(1), I_1(2), \ldots, I_1(k)$ are harvested;

(b) Calculating the context weight $\omega_1(m)$ for the collected seeds, $m = 1, 2, \ldots, k$, which grows with the sequence index $m$;

(c) Calculating the probability weight for the collected seeds, $\omega_2(m) = A(m) / \sum_{j=1}^{k} A(j)$, $m = 1, 2, \ldots, k$, where $A(m)$ and $A(j)$ are subject to the identical form $A(\mu)$, $\mu = 1, 2, \ldots, k$, built on the deviation of $I_1(\mu)$ from the seed mean $\sum_{r=1}^{k} I_1(r)/k$;

(d) Calculating the general weight for the collected seeds, $\omega(m) = \omega_1(m)\, \omega_2(m) / \sum_{j=1}^{k} \omega_1(j)\, \omega_2(j)$, $m = 1, 2, \ldots, k$;

(e) Releasing the forecasting result for the system, $I_1(k + 1) = \sum_{j=1}^{k} \omega(j)\, I_1(j)$.

It is clear that for this kind of weighted average prediction, there is a group of choices for the weight 
average strategy. The above-mentioned way is one of them. The principal requirements for the weight 
average strategy are: (a) the fresher the data point, the larger the weight is; (b) the less distance between 
the data point and the mean value, the larger the weight is; (c) the final weight vector should be a 
normalized one. 
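A weighted-average predictor that meets the three requirements above can be sketched as follows. The two weight forms used here (a context weight proportional to the sequence index, and a Gaussian-shaped probability weight around the seed mean) are assumptions chosen only to satisfy those requirements; they are not claimed to be the exact forms of the CP method, and the seed values are hypothetical.

```python
import math

def context_weights(k):
    """Assumed form: weight proportional to the sequence index (1 = oldest)."""
    return [m / k for m in range(1, k + 1)]

def probability_weights(seeds):
    """Assumed form: Gaussian-shaped score of the deviation from the seed mean."""
    k = len(seeds)
    mean = sum(seeds) / k
    var = sum((s - mean) ** 2 for s in seeds) / k
    if var == 0:
        return [1.0 / k] * k          # identical seeds share the weight evenly
    scores = [math.exp(-((s - mean) ** 2) / (2 * var)) for s in seeds]
    total = sum(scores)
    return [a / total for a in scores]

def cp_predict(seeds):
    """Return the forecast for step k + 1 and the normalized general weights."""
    w1 = context_weights(len(seeds))
    w2 = probability_weights(seeds)
    joint = [a * b for a, b in zip(w1, w2)]
    total = sum(joint)
    weights = [w / total for w in joint]
    return sum(w * s for w, s in zip(weights, seeds)), weights

seeds = [6.0, 6.3, 5.9, 6.2, 6.1]     # hypothetical I_1 records, oldest first
prediction, weights = cp_predict(seeds)
print(prediction, weights)
```

Because the general weights are normalized, the forecast always lies between the smallest and the largest seed, and a newer seed close to the mean receives the largest weight.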

As mentioned before, when one takes the knowledge of probability, statistics, and weighted average prediction into account, a kind of performance prediction method is realized. Aside from this predicting method, one can forecast an ATR system's performance by using a machine learning facility called expert prediction [72], or with a data processing technology called bootstrapping [73]. In most situations, the proposed method outperforms the others in that both the sequence and the probability are considered.

The flow diagram of the prediction algorithm is shown in Figure 4.

Figure 4. The configuration steps of the proposed prediction algorithm. 

Configuration. Work: calculating INR, fixing the number of prediction seeds, collecting prediction seeds according to their scenario.

• Case 1, a repeated test with a familiar system and a familiar trained target.

• Case 2, a familiar system with a new trained target.

• Case 3, a new ATR system on a familiar or a new target.

Initialization/Regular processing. Blocks: Entry; Prediction seeds; Weight; Prediction; Observation; The next loop.

Note:

1. Weight is generated from context weight and probability weight.

2. In the initialization phase, "Prediction seeds" is firstly formed with records in the configuration phase. As time goes on, new observations are admitted in a sequential way. Once there are only observation data in "Prediction seeds," this phase is finished. The algorithm then steps into regular procedures.

3. In the regular processing phase, "Prediction seeds" is only formed with historical observation records.



2.7. Summary of the Proposed Methodologies 

So far, this work has comprised a performance evaluation measure for an ATR system, a performance prediction method for an ATR system, and a quantity developed to scale the operating condition. The proposed methodologies are collected in Table 2. The relation among these performance indexes is shown in Figure 2. In Table 2, "SCR" means "Scaling the Condition for Recognition".



3. Experiments 

To validate this novel methodology, a series of simulations has been undertaken. A sampling of results follows. Before starting the discussion of the simulation, we should emphasize that the experiments here are: (a) to check whether the evaluation conclusion is in accordance with the performance inference; (b) to check whether the performance prediction output is proper compared to the practical performance; and (c) to validate whether the methodology can be applied to a wide range of ATR systems. Therefore, when performing experiments, three kinds of ATR systems are tested. The capability of the proposed methods to be applied in various ATR systems is thus validated. Moreover, two similar ATR systems are considered. This is to check the ability to distinguish the performance of similar ATR systems in similar scenarios.
Table 2. Summary of the proposed methodologies.

Technology   The Proposed Methodologies

SCR          INR. INR is employed to scale the condition for recognition. It is built
             based on the knowledge of pattern recognition.

PE-ATR       GARO. There are two types of GARO: n-GARO and c-GARO. GARO shows the
             capability of the ATR system on correct decisions. The related knowledge
             is statistics and engineering mathematics.

             RRO. RRO reveals how close the distribution of the operating output is
             to that in the training course. RRO is developed based on statistics.

             IRO. IRO estimates the independence of the recognition output to the
             operating condition. IRO is investigated based on statistics.

             QF-ATR. QF-ATR estimates the comprehensive performance of an ATR system.
             It is empirically proposed according to the background of ATR and is
             resolved based on the proposed evaluation indexes.

PP-ATR       CP. CP forecasts the performance of an ATR system. CP is developed based
             on the knowledge of random processes and regression.



3.1. The ATR Algorithms Setup and the Data 

The proposed methodology in this work can be applied to all ATR systems and algorithms. However, the algorithms under consideration in the experiments are limited. There are 4 ATR algorithms taken into account: a SAR ATR method based on a global scattering center model [74], an improved approach for target discrimination in high-resolution SAR images [75], and an electrocardiograph (ECG) waveform recognition algorithm based on sparse decomposition and neural network (NN) [76], which are named Sys1, Sys2, and Sys3A respectively; and a modified electroencephalograph (EEG) signal recognition measure based on empirical mode decomposition (EMD) and autoregression (AR), namely Sys3B, which is developed and validated to compare its performance results with those of Sys3A.

Sys1 is configured according to [74] (recognizing targets I, II, and III, referred to as targets 1, 2, and 3 in this work). Sys2 is implemented from [75] (recognizing targets 6, 7, and 9, referred to as T6, T7, and T9 throughout this work). Sys3A is built in conformity with [76] (recognizing P Pulse and T Pulse in this work). The EMD subsystem of Sys3B in feature extraction is directly implemented with respect to the EMD subsystem in [77]. The classifier in Sys3B is realized according to the classifier in [78]. The other subsystems in Sys3B and Sys3A are identical.

Sys1 and Sys2 are trained and tested with the data from [74,75], respectively, while Sys3A is trained and tested with the data from PhysioNet [79]. Sys3B is applied to the same data as Sys3A.

The EEG data of University of California Irvine (UCI) arises from a large study to examine EEG correlates of genetic predisposition to alcoholism [80]. It contains measurements from 64 electrodes (medical sensors) placed on the scalp and sampled at 256 Hz. Both the training portion and the test portion of the large data set are applied. The ECG data applied from PhysioNet are from the long-term ST database (ECG [Class 1; core]).

3.2. Selected Simulation Results and Analysis 

3.2.1. Partial Results (Performance Evaluation and Performance Prediction) and Analysis 

Some of the performance evaluation results are given in Table 3, while the performance forecasting results and validation thereof are shown in Table 4. The "PE" in these tables means "performance evaluation." Each record of the performance evaluation indexes is based on an original sensing and recognizing sample size of 150. Each record in Tables 3 and 4 is obtained using the Monte Carlo test with 50 runs. The principle of the performance model based on fuzzy integration (PM-FI) is detailed in [9]. The performance indexes are considered in PM-FI, and the weights of these three indexes are all set to 1.

Table 3. Partial performance evaluation results for the mentioned ATR algorithms.

Metrics                      ATR System
                             Sys1             Sys2              Sys3A             Sys3B

INR (d(i_1))                 0.31             0.79              0.36              0.36
PE index (I_1/I_3/I_4)       6.14/0.82/0.83   11.03/0.67/0.75   10.06/0.81/0.66   7.02/0.75/0.78
QF-ATR (Q)                   13.99            6.95              14.85             11.33
RR                           0.71             0.82              0.85              0.83
PM-FI                        0.76             0.78              0.82              0.82



In Table 3, several interesting conclusions can be drawn. First, the recommended methodologies can offer well-founded judgment of the system even while the operating condition is varying. Secondly, the QF-ATR considers the performance not only with the output, but also with the operating condition. For example, the $I_1$ level of Sys2 is much better than that of Sys1. At the same time, the values of $I_3$ and $I_4$ from these two systems are similar. Nevertheless, the QF-ATR of Sys2 is about half that of Sys1. The reason lies in the condition, as is indicated by INR. Third, this facility can clearly discriminate between systems when they handle identical targets under identical conditions. The evaluation results from Sys3A and Sys3B support this point. EMD and AR methods are expected to depend less on the condition than sparse decomposition and neural network methods, and the figures are in accordance with this inference.

In Table 4, the gap between the forecasting result and the actual output is slim. However, we should pay attention to the fact that each record is the mean value of 50 original performance prediction runs. The prediction error at each prediction step is still clear, as is shown by the figures in the following subsections. The result in Table 4 is encouraging. It is obvious that the prediction error of QF-ATR is much larger than that of the other indexes. This stems from the fact that QF-ATR is a function of the other variables, so all of their errors accumulate in QF-ATR.



Table 4. Performance forecasting results and validation. 



INR, PE Indexes               ATR System
and QF-ATR                    Sys1               Sys2              Sys3A            Sys3B

INR (d(i_1))                  0.31               0.79              0.36             0.36
PE index (I_1/I_3/I_4, >)     0. 10/U.Oo/U.ot    11.04/0.70/0.79   9.92/0.61/0.80   / . 1D/U. / j/U.oj
PE index (I_1/I_3/I_4, <)     6.14/0.65/0.80     10.98/0.69/0.83   9.88/0.61/0.82   7.11/0.73/0.82
QF-ATR (Q, >)                 11.24              7.76              14.17            12.11
QF-ATR (Q, <)                 10.35              7.97              13.82            11.96
UB-RR (>)                     0.75               0.85              0.90             0.90
LB-RR (>)                     0.70               0.80              0.85             0.85
RR (<)                        0.70               0.83              0.86             0.87



Legend: >: forecasted output. <: tested result. UB-RR: upper bound of recognition rate. LB-RR: lower 
bound of recognition rate. 



It may seem unusual that the QF-ATR does not strictly satisfy Equation (11) with the listed $I_1/I_3/I_4$ and INR. This stems from the fact that all indexes in Tables 3 and 4 are processed individually through the Monte Carlo test: each value is the mean over its own 50-run test. The performance prediction is performed using CP only.
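The size of this mismatch can be checked with a short calculation. The sketch below evaluates the quality factor as the product of $I_1$, $I_3$, and $I_4$ divided by INR, using the mean values listed in Table 3, and compares the result with the listed QF-ATR; the dictionary layout and function name are illustrative only.

```python
# Mean values copied from Table 3: INR, I_1, I_3, I_4 and the listed QF-ATR.
table3 = {
    "Sys1":  {"inr": 0.31, "i1": 6.14,  "i3": 0.82, "i4": 0.83, "q_listed": 13.99},
    "Sys2":  {"inr": 0.79, "i1": 11.03, "i3": 0.67, "i4": 0.75, "q_listed": 6.95},
    "Sys3A": {"inr": 0.36, "i1": 10.06, "i3": 0.81, "i4": 0.66, "q_listed": 14.85},
    "Sys3B": {"inr": 0.36, "i1": 7.02,  "i3": 0.75, "i4": 0.78, "q_listed": 11.33},
}

def quality_factor(i1, i3, i4, inr):
    """Quality factor: product of the three indexes divided by INR."""
    return i1 * i3 * i4 / inr

q_from_means = {
    name: quality_factor(row["i1"], row["i3"], row["i4"], row["inr"])
    for name, row in table3.items()
}
relative_gap = {
    name: abs(q_from_means[name] - row["q_listed"]) / row["q_listed"]
    for name, row in table3.items()
}
for name in table3:
    print(name, round(q_from_means[name], 2), round(100 * relative_gap[name], 1), "%")
```

The quality factor computed from the averaged indexes stays within a few percent of the listed QF-ATR for every system, which is the order of difference one would expect when a mean of ratios is compared with a ratio of means.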

Because the scenarios are not complicated, the prediction results of RR have high precision. 

3.2.2. Performance Evaluation with ROC Method and Analysis 

The evaluation results with the ROC method are presented in Figure 5. Here, QDD is "quadratic distance discriminator" and WQDD is "weighted quadratic distance discriminator". For Sys2, Sys3A and Sys3B in Figure 5, it may seem unusual that the RR decreases slightly once $P_{fa}$ exceeds a certain value and keeps growing. This stems from the fact that the clutter is too heavy to be effectively processed in those scenarios.

3.2.3. Performance Evaluation with "Confusion Matrix" Method 

The evaluation results of the confusion matrix method are shown in Tables 5 and 6. In Table 5, "T1" means "Target 1". The other targets are named similarly. Here, the settings of the targets for Sys1 are: signal to noise ratio (SNR) is 10 dB, elevation is 10°, and the result is arrived at with 500 Monte Carlo simulations [74]. The result of Sys2 comes from the "Experiment and analysis of data provided by the Institute of Electronics, Chinese Academy of Sciences" [75]. In Table 6, "P Pulse" and "T Pulse" are different waveforms which have implications in medical science.

Figure 5. Performance evaluation results of the mentioned ATR systems with ROC approach. (a) The recognition rate of Sys1 when SNR is changing; (b) The recognition rate of Sys2 when false alarm setting is changing; (c) The recognition rate of Sys3A and Sys3B when false alarm setting is changing.
Table 5. Performance evaluation results with "Confusion Matrix Method" for the mentioned ATR algorithms Sys1 and Sys2.

Sys1
Target   Omitted [R]   T1    T2    T3
T1       2             443   50    5
T2       3             20    475   2
T3       13            6     11    470

Sys2
Target   Omitted [R]   T6    T7    T9
T6       0             35    2     1
T7       0             1     33    4
T9       1             0     1     36
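As a small worked example, the counts of Table 5 can be turned into per-target and overall recognition rates. The dictionary layout and function names below are illustrative; the counts are copied from Table 5, with each row holding the decisions made for one true target.

```python
# Rows: true target; columns: system decision (counts from Table 5).
sys1 = {
    "T1": {"Omitted": 2,  "T1": 443, "T2": 50,  "T3": 5},
    "T2": {"Omitted": 3,  "T1": 20,  "T2": 475, "T3": 2},
    "T3": {"Omitted": 13, "T1": 6,   "T2": 11,  "T3": 470},
}
sys2 = {
    "T6": {"Omitted": 0, "T6": 35, "T7": 2,  "T9": 1},
    "T7": {"Omitted": 0, "T6": 1,  "T7": 33, "T9": 4},
    "T9": {"Omitted": 1, "T6": 0,  "T7": 1,  "T9": 36},
}

def per_target_rate(confusion):
    """Share of correct decisions for each true target."""
    return {t: row[t] / sum(row.values()) for t, row in confusion.items()}

def overall_rate(confusion):
    """Share of correct decisions over all trials."""
    correct = sum(row[t] for t, row in confusion.items())
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total

print(per_target_rate(sys1), overall_rate(sys1))
print(per_target_rate(sys2), overall_rate(sys2))
```

Each Sys1 row sums to the 500 Monte Carlo simulations mentioned in the text, and each Sys2 row sums to 38 trials, so the two overall rates rest on very different sample sizes.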



Table 6. Performance evaluation results with "Confusion Matrix Method" for the mentioned ATR algorithms Sys3A and Sys3B.

           Sys3A                               Sys3B
Signal     Omitted [R]   P Pulse   T Pulse     Omitted [R]   P Pulse   T Pulse
P Pulse    2             59        4           5             58        2
T Pulse    3             2         60          0             4         61

3.3. More Simulation Results With Brief Analysis

To show the capability of the methodology more clearly, a further portion of the simulation results is presented.

The primary setting of the performance evaluation experiments has been collected in Table 7, where 
"Figure 6, 0.31" means the INR in Figure 6 for the corresponding system is 0.31. The remaining items 
follow this rule. 



Table 7. The primary setting of the performance evaluation experiments. 



Metric   ATR System
         Sys1             Sys2             Sys3A             Sys3B

INR      Figure 6, 0.31   Figure 6, 0.79   Figure 16, 0.36   Figure 16, 0.36
         Figure 7, 0.77   Figure 7, 0.56   Figure 17, 0.40   Figure 17, 0.40



The primary setting of the performance prediction simulations has been collected in Table 8, where 
"Figure 8, 12" means the number of prediction seeds in Figure 8 is 12. The remaining figures follow 
this rule. 

3.3.1. More Performance Evaluation Results on Sysl and Sys2 

The step-by-step performance evaluation results are presented in Figures 6 and 7. As demonstrated in Figures 6 and 7, even for a certain ATR system regarding a certain target under a certain condition, the performance fluctuates. However, the range of fluctuation differs between systems.

The upper-left part (Figures 6 and 7) suggests that the $I_1$ of Sys1 is much more robust than the $I_1$ of Sys2. For $I_3$ and $I_4$, Sys1 and Sys2 are similar in the first scenario. Moreover, there is a modest difference in $I_3$ and $I_4$ between Sys1 and Sys2 in the second scenario. One should be aware that each data point in performance evaluation is arrived at from N ATR tests in practice, as is shown in Table 1, and is subject to Equations (6), (8), (10) and (11).

In Figure 6, it may seem unusual that the $I_1$ of Sys2 is much better than that of Sys1, while the QF-ATR of Sys1 exceeds that of Sys2. The reason lies in the difference of INR, which shows that the recognition condition is much worse for Sys1 than it is for Sys2.

As presented in these data, the performance of Sys1 is more robust than that of Sys2 in these two scenarios.

3.3.2. More Performance Prediction Results on Sysl and Sys2 

Detailed performance prediction results of the above-mentioned Sys1 and Sys2 are given accordingly (Figures 8-15). It can be seen that the performance prediction algorithm developed in this work is able to forecast the performance of an ATR system. One should note that each predicted data point here and thereafter is arrived at from a different number of prediction seeds (shown in Table 8), and is subject to the prediction procedures. The actual output is also obtained from N tests (shown in Table 1).

From Figures 8-10 and Figures 12-14, one may see that, for a given ATR system under a certain condition, the fewer the prediction seeds, the more flexible the prediction. On most occasions, the poorest match of the prediction results occurs in the initial part. As the prediction continues, the error tends to decline.



Table 8. The primary setting of the performance prediction experiments. 



Name of the                  ATR System
Parameter                    Sys1             Sys2             Sys3A            Sys3B

INR                          0.31             0.79             0.36             0.36
Number of Prediction Seeds   Figure 8, 12/    Figure 12, 12/   Figure 18, 12/   Figure 22, 12/
                             Figure 9, 6/     Figure 13, 6/    Figure 19, 6/    Figure 23, 6/
                             Figure 10, 3     Figure 14, 3     Figure 20, 3     Figure 24, 3

INR                          0.55             0.63             0.49             0.49
Number of Prediction Seeds   Figure 11, 3     Figure 15, 3     Figure 21, 3     Figure 25, 3



Figure 6. Detailed performance evaluation results of Sys1 and Sys2 for 50 runs, with INR_Sys1 = 0.31, INR_Sys2 = 0.79. (a) Detailed results of $I_1$; (b) Detailed results of $I_3$; (c) Detailed results of $I_4$; (d) Detailed results of $Q$.

Figure 7. Detailed performance evaluation results of Sys1 and Sys2 for 50 runs, with INR_Sys1 = 0.77; INR_Sys2 = 0.56. (a) Detailed results of $I_1$; (b) Detailed results of $I_3$; (c) Detailed results of $I_4$; (d) Detailed results of $Q$.

Figure 8. Detailed performance prediction results of Sys1 for 50 runs (INR = 0.31, the number of prediction seeds is 12). (a) Predicted and actual $I_1$; (b) Predicted and actual $I_3$; (c) Predicted and actual $I_4$; (d) Predicted and actual $Q$.

Figure 9. Detailed performance prediction results of Sys1 for 50 runs (INR = 0.31, the number of prediction seeds is 6). (a) Predicted and actual $I_1$; (b) Predicted and actual $I_3$; (c) Predicted and actual $I_4$; (d) Predicted and actual $Q$.

Figure 10. Detailed performance prediction results of Sys1 for 50 runs (INR = 0.31, the number of prediction seeds is 3). (a) Predicted and actual $I_1$; (b) Predicted and actual $I_3$; (c) Predicted and actual $I_4$; (d) Predicted and actual $Q$.

Figure 11. Detailed performance prediction results of Sys1 for 50 runs (INR = 0.55, the number of prediction seeds is 6). (a) Predicted and actual $I_1$; (b) Predicted and actual $I_3$; (c) Predicted and actual $I_4$; (d) Predicted and actual $Q$.

Figure 12. Detailed performance prediction results of Sys2 for 50 runs (INR = 0.79, the number of prediction seeds is 12). (a) Predicted and actual $I_1$; (b) Predicted and actual $I_3$; (c) Predicted and actual $I_4$; (d) Predicted and actual $Q$.

Figure 13. Detailed performance prediction results of Sys2 for 50 runs (INR = 0.79, the number of prediction seeds is 6). (a) Predicted and actual $I_1$; (b) Predicted and actual $I_3$; (c) Predicted and actual $I_4$; (d) Predicted and actual $Q$.

Figure 14. Detailed performance prediction results of Sys2 for 50 runs (INR = 0.79, the number of prediction seeds is 3). (a) Predicted and actual $I_1$; (b) Predicted and actual $I_3$; (c) Predicted and actual $I_4$; (d) Predicted and actual $Q$.

Figure 15. Detailed performance prediction results of Sys2 for 50 runs (INR = 0.63, the number of prediction seeds is 6). (a) Predicted and actual $I_1$; (b) Predicted and actual $I_3$; (c) Predicted and actual $I_4$; (d) Predicted and actual $Q$.

3.3.3. Further Performance Evaluation Results on Sys3A and Sys3B

Figures 16 and 17 present the performance evaluation results on Sys3A and Sys3B. Because the target being recognized is the same one and the designation of these two systems is similar, the actual outputs from Sys3A and Sys3B maintain a similar tendency. However, the performance marks from the proposed method are different. This result confirms that, even in a challenging evaluation task, the newly developed methodology is suitable for evaluating the ATR system's performance.

In the upper-left part (Figures 16 and 17), the $I_1$ of Sys3A is much better than that of Sys3B. For $I_3$ and $I_4$, Sys3A and Sys3B are similar in these two scenarios. In the lower-right part (Figures 16 and 17), it is clear that the QF-ATR of Sys3A exceeds that of Sys3B.

3.3.4. Further Performance Prediction Results on Sys3A and Sys3B 

Detailed performance prediction results for Sys3A and Sys3B are given in Figures 18-25. These
results confirm that the proposed performance prediction method works well in forecasting the
performance of Sys3A and Sys3B. Although errors exist in individual runs, the prediction accuracy
is about as good as can be expected.

Figure 16. Detailed performance evaluation results of Sys3A and Sys3B for 50 runs, with
INR_Sys3A = 0.36 and INR_Sys3B = 0.36. (a) Detailed results of I_1; (b) Detailed results
of I_3; (c) Detailed results of I_4; (d) Detailed results of Q.

Figure 17. Detailed performance evaluation results of Sys3A and Sys3B for 50 runs, with
INR_Sys3A = 0.40 and INR_Sys3B = 0.40. (a) Detailed results of I_1; (b) Detailed results of
I_3; (c) Detailed results of I_4; (d) Detailed results of Q.

Figure 18. Detailed performance prediction results of Sys3A for 50 runs (INR = 0.36, the
number of prediction seeds is 12). (a) Predicted and actual I_1; (b) Predicted and actual I_3;
(c) Predicted and actual I_4; (d) Predicted and actual Q.

Figure 19. Detailed performance prediction results of Sys3A for 50 runs (INR = 0.36,
the number of prediction seeds is 6). (a) Predicted and actual I_1; (b) Predicted and actual I_3;
(c) Predicted and actual I_4; (d) Predicted and actual Q.

Figure 20. Detailed performance prediction results of Sys3A for 50 runs (INR = 0.36,
the number of prediction seeds is 3). (a) Predicted and actual I_1; (b) Predicted and actual I_3;
(c) Predicted and actual I_4; (d) Predicted and actual Q.




Figure 21. Detailed performance prediction results of Sys3A for 50 runs (INR = 0.49,
the number of prediction seeds is 6). (a) Predicted and actual I_1; (b) Predicted and actual I_3;
(c) Predicted and actual I_4; (d) Predicted and actual Q.

Figure 22. Detailed performance prediction results of Sys3B for 50 runs (INR = 0.36,
the number of prediction seeds is 12). (a) Predicted and actual I_1; (b) Predicted and actual I_3;
(c) Predicted and actual I_4; (d) Predicted and actual Q.

Figure 23. Detailed performance prediction results of Sys3B for 50 runs (INR = 0.36,
the number of prediction seeds is 6). (a) Predicted and actual I_1; (b) Predicted and actual I_3;
(c) Predicted and actual I_4; (d) Predicted and actual Q.

Figure 24. Detailed performance prediction results of Sys3B for 50 runs (INR = 0.36,
the number of prediction seeds is 3). (a) Predicted and actual I_1; (b) Predicted and actual I_3;
(c) Predicted and actual I_4; (d) Predicted and actual Q.

Figure 25. Detailed performance prediction results of Sys3B for 50 runs (INR = 0.49,
the number of prediction seeds is 6). (a) Predicted and actual I_1; (b) Predicted and actual I_3;
(c) Predicted and actual I_4; (d) Predicted and actual Q.

3.4. Comparison between the Existing Technologies and the Proposed Methodologies in Performance
Evaluation for an ATR System

Based on the material presented above, Table 9 compares the existing technologies with the
proposed methodologies. As mentioned earlier, most existing performance prediction methods are
extensions of performance evaluation technologies; therefore, a separate comparison between
performance prediction methods is not presented and is left to the reader. The meanings of the
row labels L1 to L5 are listed below.

• L1: Is the operating condition considered in the course of the evaluation?

• L2: The objectiveness of the evaluation result.

• L3: The effectiveness of the method in revealing the performance from various aspects.

• L4: The generalization of the method.

• L5: Is the method easy to configure?

Table 9. The comparison between the existing technologies and the proposed methodologies 
in performance evaluation for an ATR system. 



Aspect   Existing Technologies          Newly Proposed Methodologies
         ROC    CM     RR     PM-FI     GARO   RRO    IRO    QF-ATR
L1       ×      ×      ×      √         *             √      √
L2       ▲      ▲      ▲      *         ▲      ▲      •      •
L3       ▲      ▲      ■      •         •      •      •      •
L4       •      •      •      *         •      •      •      •
L5              *      ★      ▲         *      •      •      •



Legend: The following symbols apply to all the tables in this work. ★: high achievement;
•: satisfactory; ▲: improvement needed; ■: unsatisfactory; △: should be considered according to the
corresponding method; √: yes; ×: no.

3.5. Discussion 

As can be seen from the data above, the proposed methodology offers reasonable performance
evaluation and performance prediction results for the ATR systems. To make the mechanism practical
and reliable, some extended topics related to this work remain to be addressed.

First, for some ATR systems, it may be difficult to determine the INR. The features used for
recognition may be indistinct or may not convert directly into variables; for example, the images,
voices, smells, and similar cues used to recognize animals cannot readily be scaled into feature
vectors. For signals that cannot be represented by feature vectors, the INR is temporarily set to 1
for all s possible situations. The system then processes those signals, yielding s "faked" QF-ATR
values (faked because the INR has not been considered), Q_i, i = 1, 2, ..., s. Consequently, for
the i-th situation,

[Equation (12) is illegible in the source; only the index fragments i = 1 and k = 1 and the term Q_k survive.]   (12)


the QF-ATR with INR is then arrived at. 

Second, it is useful to determine the sample size and the degree of confidence for a field test.
Once the acceptable risk for a field test has been specified, the sample size and the degree of
confidence can be obtained through hypothesis testing.
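
The relation between risk and sample size can be illustrated with a minimal sketch. The paper does not state which test it has in mind, so the following is an assumption: each run is treated as an independent success/failure trial, the recognition rate is tested one-sidedly (H0: p = p0 against H1: p = p1), and the normal approximation to the binomial distribution is used. The function name `required_sample_size` and the parameters `alpha` (type I risk) and `beta` (type II risk) are hypothetical and do not come from the paper; the degree of confidence corresponds to 1 - alpha.

```python
import math
from statistics import NormalDist


def required_sample_size(p0, p1, alpha=0.05, beta=0.10):
    """Runs needed for a one-sided test of H0: p = p0 against H1: p = p1.

    p0, p1 : recognition rates under the null and alternative hypotheses
    alpha  : accepted risk of rejecting H0 when it is true (type I error)
    beta   : accepted risk of keeping H0 when H1 is true (type II error)

    Assumption (not from the paper): normal approximation to the binomial.
    """
    if not (0.0 < p0 < 1.0 and 0.0 < p1 < 1.0):
        raise ValueError("p0 and p1 must lie strictly between 0 and 1")
    if p0 == p1:
        raise ValueError("p0 and p1 must differ")
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")

    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1.0 - alpha)  # quantile for the type I risk
    z_beta = standard_normal.inv_cdf(1.0 - beta)    # quantile for the type II risk

    # Standard deviations of a single trial under H0 and under H1.
    spread = (z_alpha * math.sqrt(p0 * (1.0 - p0))
              + z_beta * math.sqrt(p1 * (1.0 - p1)))
    return math.ceil((spread / abs(p1 - p0)) ** 2)


if __name__ == "__main__":
    # Illustrative values only: distinguish an 80% from a 90% recognition rate.
    runs = required_sample_size(p0=0.80, p1=0.90, alpha=0.05, beta=0.10)
    print(f"runs needed: {runs}")
```

Tightening either risk, or narrowing the gap between p0 and p1, increases the number of runs required, which is why the risk must be fixed before the field test is planned.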

In addition, if the available sample size is smaller than required, bootstrapping can offer some help [68,73].

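
As a hedged illustration of how bootstrapping can help with few samples, the sketch below computes a percentile bootstrap confidence interval for the mean of a small set of per-run scores (for example, QF-ATR values). This is a generic percentile bootstrap, not necessarily the procedure of [68,73]; the function name `bootstrap_mean_ci`, its parameters, and the score values are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def bootstrap_mean_ci(samples, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `samples`.

    Returns (sample_mean, lower_bound, upper_bound).
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")

    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(samples)

    # Resample with replacement and record the mean of each resample.
    resampled_means = sorted(
        mean(rng.choices(samples, k=n)) for _ in range(n_resamples)
    )

    # Read the interval bounds from the empirical percentiles.
    tail = (1.0 - confidence) / 2.0
    lower = resampled_means[int(tail * (n_resamples - 1))]
    upper = resampled_means[int((1.0 - tail) * (n_resamples - 1))]
    return mean(samples), lower, upper


if __name__ == "__main__":
    # Hypothetical scores from eight runs; not data from the paper.
    scores = [0.61, 0.58, 0.66, 0.63, 0.59, 0.64, 0.60, 0.62]
    estimate, low, high = bootstrap_mean_ci(scores)
    print(f"mean = {estimate:.3f}, 95% interval = [{low:.3f}, {high:.3f}]")
```

The interval width gives a rough indication of how much the evaluation result could change if more runs were available.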
4. Conclusions 

To sum up, this work offers a comprehensive performance analysis tool for ATR systems. For different
systems processing an identical target under various conditions, the evaluation results reveal each
system's accomplishment through the evaluation indexes and the QF-ATR, as confirmed by the
experimental results. At the same time, the methodology imposes no limitations or presumptions on
the system being considered.

For a given ATR system, the INR index scales the operating condition in an objective way; the
evaluation indexes and the evaluation function serve to interpret the system's accomplishments. The
QF-ATR, like the quality factor in circuits, may reveal the general capabilities of the entire system.
All the proposed methodologies are suitable for all existing ATR systems, and they are especially
helpful for ATR in radars and photo-sensors.

While convenient to apply, this methodology may seem unfamiliar at first because it is newly proposed.
Although it is still too early to determine whether this is the most suitable way to conduct PE-ATR,
the results it provides place PE-ATR on a more objective and quantitative footing. It can also serve
as a reference for the performance analysis of similar systems.

Future research on this topic may include:

• Validation of the methodology with large-scale field tests.

• Application to different ATR systems.

• Performance evaluation and performance prediction with fewer samples.

Glossary of Abbreviations

AAF        amplitude affection factor
AR         autoregression
Auto-I     automated instrumentation and evaluation
CFAR       constant false alarm rate
CM         confusion matrix
ECG        electrocardiograph
EEG        electroencephalograph
EMD        empirical mode decomposition
EOC        extended operating condition
EP         expert prediction
ER         extend of recognition
FPGA       flexible field programmable gate array
IR         infrared
LB-RR      lower bound of recognition rate
MFRR       measurement of false recognition rate
MOE        measures of effectiveness
MRR        measurement of recognition rate
NN         neural network
PE         performance evaluation
PE-ATR     performance evaluation for ATR systems
PM-FI      performance models based on fuzzy integration
POLInSAR   polarimetric SAR interferometry
PP-ATR     performance prediction for ATR systems
QDD        quadratic distance discriminator
ROC        receiver operating characteristic
RR         recognition rate
SAR        synthetic aperture radar
SCR        scaling the condition for recognition
SNR        signal to noise ratio
SNRF       signal to noise ratio affect factor
START      the scoring, truthing, and registration toolkit
UB-RR      upper bound of recognition rate
UCI        University of California Irvine
WQDD       weighted quadratic distance discriminator